CARTA: Computational Neuroscience and Anthropogeny with Terry Sejnowski

University of California Television (UCTV)
5 Dec 2022 · 24:25

Summary

TL;DR: The talk explores the intersection of computational neuroscience and anthropology, delving into how the brain functions computationally. It highlights the advances in neuroscience made possible by the Brain Initiative, which enables the recording of hundreds of thousands of neurons. The speaker contrasts early neural networks from the 1980s with modern deep learning models, emphasizing the evolution to complex architectures capable of tasks like language translation. The discussion also touches on the controversy surrounding the consciousness and understanding of AI, proposing the 'mirror hypothesis': that AI reflects the intelligence of its interrogator.

Takeaways

  • 🧠 The human brain, with its 100 billion neurons, is a complex organ that continues to generate activity even in the absence of external stimuli.
  • 🔬 The Brain Initiative launched by President Obama in 2013 has significantly advanced systems neuroscience, enabling the recording of hundreds of thousands of neurons simultaneously.
  • 🐟 Research on model organisms like zebra fish larvae has provided insights into brain activity, showing that brains are active even when the organism is immobilized and in darkness.
  • 📈 The computational power required for training neural networks has increased exponentially over time, with modern networks like GPT-3 requiring a million million times more computation than earlier models.
  • 🌐 The advancements in deep learning have led to the development of sophisticated language models capable of understanding and generating human-like text.
  • 🗣️ Early neural networks, like the one used in the 1980s text-to-speech project, were primitive compared to today's models but still demonstrated the potential for machine learning in language processing.
  • 🤖 The architecture of modern neural networks, including recurrent and transformer models, allows for the handling of complex tasks such as language translation and understanding social interactions.
  • 🤝 The concept of 'attention' in transformer models is crucial for understanding and generating contextually relevant responses, mirroring the way humans process language.
  • 💬 Large language models like LaMDA can generate responses that appear to show understanding and even 'sentience', but their capabilities are heavily dependent on the quality of the prompts they receive.
  • 🔮 The debate over whether AI models are truly conscious or just mimicking human-like responses is ongoing, with some experts arguing for a 'mirror hypothesis' suggesting that AI reflects the intelligence of the interviewer.

Q & A

  • What does the speaker describe as paradoxical about humans?

    -Humans are paradoxical because they are bipedal, naked, have large brains, and are masters of fire, tools, and language, yet they are still trying to understand themselves and are aware of their inevitable death.

  • What was the Brain Initiative announced by Barack Obama in 2013?

    -The Brain Initiative aimed to develop innovative new technologies that could revolutionize systems neuroscience by enabling the recording of hundreds of thousands of neurons at a time.

  • How has the ability to record from a large number of neurons at once impacted neuroscience?

    -The ability to record from a large number of neurons at once has dramatically increased the understanding of brain activity patterns, showing that the brain is constantly generating activity even in the absence of external stimuli.

  • What was the significance of the text-to-speech project in the 1980s mentioned in the script?

    -The text-to-speech project in the 1980s was significant because it demonstrated that a simple neural network could master complex language tasks like text-to-speech conversion, challenging traditional linguistic views that relied on rules.

  • How does the Back-Propagation Learning Algorithm mentioned in the script work?

    -The Back-Propagation Learning Algorithm works by computing the error at the network's output and propagating it backward through the network, adjusting the weights of the connections between neurons (units) to reduce that error; the training text is presented over and over until the network learns to pronounce new words accurately.
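As a concrete illustration, here is a minimal sketch of back-propagation in Python. It is not the original NETtalk code (which mapped a seven-letter text window to phonemes); it trains a toy one-hidden-layer network on XOR as a stand-in task, with hypothetical layer sizes and learning rate.

```python
import numpy as np

# Minimal back-propagation sketch (hypothetical sizes, not NETtalk itself):
# a one-hidden-layer network whose connection weights are adjusted,
# pass after pass, to shrink the output error.

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy stand-in task: XOR, which a network with no hidden units cannot learn.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(scale=1.0, size=(2, 8))   # input -> hidden weights
W2 = rng.normal(scale=1.0, size=(8, 1))   # hidden -> output weights
lr = 1.0

losses = []
for _ in range(5000):                     # "go through the text over and over"
    H = sigmoid(X @ W1)                   # hidden-unit activity
    Y = sigmoid(H @ W2)                   # network output
    err = Y - T                           # error at the output...
    losses.append(float((err ** 2).mean()))
    delta_out = err * Y * (1 - Y)         # ...propagated backward
    delta_hid = (delta_out @ W2.T) * H * (1 - H)
    W2 -= lr * H.T @ delta_out            # each weight moves downhill on the error
    W1 -= lr * X.T @ delta_hid

print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

Each pass computes the output error, propagates it backward through the hidden layer, and nudges every weight; the printed loss shrinks as the "text" is revisited.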

  • What advancements in network architectures have been highlighted in the transition from the 20th to the 21st century?

    -The advancements include the move from simple neural networks to deep learning with multiple layers of hidden units, the introduction of recurrent architectures for learning temporal sequences, and the development of transformers with attention mechanisms.

  • Why are recurrent networks important for language processing?

    -Recurrent networks are important for language processing because they can handle temporal sequences, allowing the network to understand the context and order of words, which is crucial for tasks like language translation.
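The idea can be sketched in a few lines. This is a hedged, minimal Elman-style recurrent step, not any specific network from the talk; the dimensions and random weights are arbitrary. The point is that the hidden state carries context forward, so word order changes the result.

```python
import numpy as np

# Sketch of one recurrent-network step: the hidden state h is fed back
# in at every step, so earlier "words" influence later processing.

rng = np.random.default_rng(1)
d_in, d_h = 4, 6
W_xh = rng.normal(size=(d_in, d_h)) * 0.5   # input -> hidden
W_hh = rng.normal(size=(d_h, d_h)) * 0.5    # hidden -> hidden (the recurrence)

def run(sequence):
    """Feed a sequence of one-hot 'words'; return the final hidden state."""
    h = np.zeros(d_h)
    for x in sequence:
        h = np.tanh(x @ W_xh + h @ W_hh)    # same weights reused at every step
    return h

a, b = np.eye(d_in)[0], np.eye(d_in)[1]
# Reversing the word order changes the final state: the network is
# sensitive to temporal sequence, unlike a bag-of-words model.
print(np.allclose(run([a, b]), run([b, a])))  # → False
```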

  • How does the transformer network architecture differ from earlier feed-forward networks?

    -Transformer networks differ from earlier feed-forward networks by using an encoder-decoder structure with attention mechanisms. The encoder embeds an entire sentence or paragraph at once, and the decoder produces the output word by word, which enhances the model's ability to use context.
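The core of the attention mechanism mentioned here can be sketched as scaled dot-product attention (the standard transformer formulation, simplified to a single head with no learned projections): every position looks at every other position at once and takes a weighted average.

```python
import numpy as np

# Scaled dot-product attention, simplified: Q (queries), K (keys) and
# V (values) are all derived from the same sentence in self-attention.

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention(Q, K, V):
    """Q, K, V: (num_words, d) arrays. Returns one context vector per word."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)   # how relevant each word is to each other word
    weights = softmax(scores)       # each row sums to 1
    return weights @ V, weights     # context-weighted mixture of values

rng = np.random.default_rng(2)
X = rng.normal(size=(5, 8))         # a 5-"word" sentence, 8-dim embeddings
out, w = attention(X, X, X)         # self-attention: the sentence attends to itself
print(out.shape, np.allclose(w.sum(axis=1), 1.0))
```

A real transformer adds learned Q/K/V projection matrices, multiple heads, and feed-forward layers on top of this operation.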

  • What is the 'mirror hypothesis' proposed by the speaker regarding large language models?

    -The 'mirror hypothesis' suggests that large language models reflect the intelligence of the interviewer. If prompted with sophisticated questions, they provide sophisticated answers; if prompted with nonsensical questions, they provide nonsensical answers.

  • What is the significance of the comparison between the brain's functionality and the transformer model's loop?

    -The comparison highlights the remarkable similarity between how the human brain processes language and how the transformer model operates, suggesting that the model can replicate some of the brain's language processing functionality.
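The loop in question can be sketched as autoregressive generation: each output word is fed back in as input for the next step, analogous to the cortex-basal ganglia loop described in the talk. The `next_word` lookup table below is a hypothetical stand-in for a trained transformer; only the feedback loop itself reflects the talk.

```python
# Sketch of the autoregressive loop: produce a word, feed it back in,
# produce the next word. A real model replaces `next_word` with a full
# forward pass through the transformer.

def next_word(context):
    # Hypothetical stand-in model: a fixed lookup instead of a real network.
    table = {"the": "brain", "brain": "generates", "generates": "activity"}
    return table.get(context[-1], "<end>")

def generate(prompt, max_words=5):
    words = list(prompt)
    while len(words) < max_words:
        w = next_word(words)        # output of one pass through the model...
        if w == "<end>":
            break
        words.append(w)             # ...fed back in as the next input
    return " ".join(words)

print(generate(["the"]))  # → "the brain generates activity"
```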

  • What does the speaker suggest about the future of understanding large language models?

    -The speaker suggests that with mathematical analysis and further study, we will eventually understand the underlying mechanisms that give large language models their abilities, much like how we understand other complex systems.

Outlines

00:00

🧠 Understanding the Human Brain

The speaker introduces the paradoxical nature of humans, with a large brain, capable of complex tasks yet still striving to understand itself. They discuss the slow maturation of humans, the passing of knowledge, empathy, and deception, and how the future is shaped by our collective past. The talk focuses on computational neuroscience and anthropology, aiming to understand brain function through computation and experimental data. Historically, studying single neurons at a time was the norm until the Brain Initiative by President Obama in 2013, which led to innovations allowing the recording of hundreds of thousands of neurons, exemplified by activity patterns in a zebra fish larva's brain. The speaker also contrasts this with earlier work in the 1980s on text-to-speech and language using simple neural networks.

05:02

📚 Deep Learning and Language Models

The narrative moves to the advancements in deep learning, highlighting the evolution from simple neural networks to complex architectures with hundreds of layers, capable of recognizing objects, images, and performing many tasks. The importance of temporal sequence learning for tasks like language translation is underscored. The speaker discusses the mathematical analysis of recurrent neural networks and their proficiency in recursion, essential for language. They illustrate the progress in computational performance over time, noting the significant increase with the advent of graphics processing units and the rise of transformer networks like GPT-3, which require immense computational power to train.

10:04

🌐 Language Models and Social Interaction

This section delves into the capabilities of transformers and their ability to generate responses to questions, drawing a comparison with how the human brain formulates sentences. The speaker discusses the controversy surrounding transformers, with some suggesting they possess consciousness while others argue they merely mimic input data. Examples are given where transformers, when prompted with complex social scenarios, provide nuanced responses, suggesting a form of social modeling. However, when faced with nonsensical questions, as in the case of Douglas Hofstadter's interview, the responses are nonsensical, indicating the models' dependency on the quality of the prompt.

15:06

🪄 The Mirror Hypothesis of AI Intelligence

The speaker introduces the 'mirror hypothesis,' proposing that large language models reflect the intelligence of the interviewer. They recount instances where the model LaMDA, when given sophisticated questions, provided sophisticated answers, whereas GPT-3, prompted with nonsensical questions, responded in kind. The hypothesis suggests that these models are not just AI but also a measure of human intelligence, acting as a 'reverse Turing test.' The speaker also recounts the story of the 'Mirror of Erised' from Harry Potter, drawing parallels to the introspective nature of these AI models.

20:07

🧐 Philosophical Implications and Future Prospects

In the final paragraph, the speaker engages with philosophical questions about the nature of intelligence, consciousness, and understanding. They discuss the self-referential nature of these concepts and how a machine like GPT-3, which can explain itself, might possess a form of intelligence. The speaker suggests that while some believe GPT-3 is conscious, it is more likely a product of its training and the vast amount of text it has absorbed. They conclude by pondering the future of AI and its potential to unlock deeper understanding of human intelligence and cognition.

Keywords

💡Computational Neuroscience

Computational neuroscience is an interdisciplinary science that uses mathematical models, theoretical analysis, and computer simulations to study the neural basis of behavior and cognitive processes. In the video, the speaker discusses how computational neuroscience approaches are used to understand brain functions from a computational perspective, emphasizing the use of experimental data to decode brain activity patterns.

💡Anthropology

Anthropology is the study of human societies, their cultures, and their development. The video touches on anthropology when discussing the paradoxical nature of humans, such as our ability to understand death and yet remain optimistic, which reflects the discipline's focus on human culture and behavior.

💡Brain Initiative

The Brain Initiative is a research effort aimed at developing innovative technologies to understand the human brain and treat brain disorders. The speaker mentions the initiative as a catalyst for the advancement in systems neuroscience, highlighting its role in enabling the recording of neural activity from hundreds of thousands of neurons.

💡Neurons

Neurons are the fundamental units of the nervous system, responsible for transmitting information through electrical and chemical signals. The script discusses the challenge of studying the brain with its approximately 100 billion neurons, emphasizing the scale and complexity of the human brain.

💡Deep Learning

Deep learning is a subset of machine learning that uses neural networks with many layers, or 'deep' architectures, to model and understand data with multiple levels of abstraction. The video references deep learning in the context of advancements in neural network architectures that have revolutionized tasks like image and language recognition.

💡Recurrent Neural Networks (RNNs)

Recurrent neural networks are a class of neural networks that are designed to recognize patterns in sequences of data, such as time series or natural language. The speaker contrasts RNNs with feed-forward networks, highlighting their ability to process sequences and learn temporal patterns, which is crucial for tasks like language translation.

💡Back-Propagation

Back-propagation is a supervised learning algorithm used to train artificial neural networks. The script mentions the Back-Propagation Learning Algorithm developed by Geoffrey Hinton, which was used to train a simple neural network for text-to-speech conversion, illustrating the evolution of learning algorithms over time.

💡Transformers

Transformers are a type of deep learning model that has gained popularity in natural language processing due to their effectiveness in handling sequential data. The video discusses transformers in relation to their ability to process language and generate responses, comparing their functionality to certain aspects of human brain activity.

💡Attention Mechanism

The attention mechanism is a technique used in neural networks to weigh the importance of different parts of the input data. In the context of the video, attention is highlighted as a crucial component of transformer models, enabling them to focus on specific parts of the input data when generating outputs.

💡Self-Supervision

Self-supervision is a training technique where a model learns to predict aspects of the input data that have been masked or removed. The script refers to self-supervision as a method used to train large language models, drawing a parallel between how these models learn and how humans learn language.
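A toy illustration of the idea, under the simplifying assumption that predicting the next word from raw text stands in for the self-supervised objective: the training targets come from the text itself, so no human labels are needed.

```python
from collections import Counter, defaultdict

# Self-supervision in miniature: the "labels" (next words) are extracted
# from the raw text itself. Real language models do this at vastly
# larger scale with neural networks rather than counts.

text = "the brain is active the brain is plastic the mind is active".split()

# Build (word -> next-word) counts from the text alone.
nxt = defaultdict(Counter)
for w, w2 in zip(text, text[1:]):
    nxt[w][w2] += 1

def predict(word):
    """Most likely next word, learned without any external labels."""
    return nxt[word].most_common(1)[0][0]

print(predict("the"))  # → "brain" (follows "the" twice, vs "mind" once)
print(predict("is"))   # → "active"
```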

💡Mirror of Erised

The Mirror of Erised is a fictional object from the Harry Potter series that shows the viewer's deepest desires. The video uses the 'Mirror of Erised' as a metaphor for how large language models reflect the intelligence of the interviewer, suggesting that the models respond in a manner that mirrors the sophistication of the questions asked.

Highlights

The paradoxical nature of humans as a species, characterized by our bipedalism, lack of body hair, large brains, and mastery of fire, tools, and language, yet still grappling with self-understanding.

The announcement of the Brain Initiative by President Barack Obama in 2013, aiming to develop innovative technologies for neuroscience.

The revolutionary impact of the Brain Initiative on systems neuroscience, enabling the recording of hundreds of thousands of neurons simultaneously.

The constant activity of the brain, even in the absence of external stimuli, as demonstrated by the zebra fish larva experiment.

The evolution of neural networks from simple models in the 1980s to complex deep learning architectures in the 21st century.

The introduction of the Back-Propagation Learning Algorithm by Geoffrey Hinton, which revolutionized machine learning.

The ability of neural networks to learn and pronounce new words through repeated exposure to text, as illustrated by the NetTalk project.

The development of deep learning and its ability to recognize objects, images, and perform complex tasks through layered architectures.

The importance of recurrent neural networks in learning temporal sequences, crucial for tasks like language translation.

The practical application of neural networks in modern smartphones for real-time language translation, turning science fiction into reality.

The mathematical analysis of recurrent networks and their optimal use of memory in generating hierarchical languages.

The exponential growth in computational performance required for training neural networks, with a significant boost from graphics processing units.

The introduction of the transformer model in natural language processing, which can process entire sentences and produce coherent responses.

The significance of the attention mechanism in transformers for understanding context and generating relevant responses.

The comparison between the human brain's neural pathways and the loop structure found in transformer models, suggesting a mirroring of functionality.

The controversy surrounding the consciousness of AI models like LaMDA, with some claiming sentience and others refuting it.

The 'mirror hypothesis' suggesting that large language models reflect the intelligence of the interviewer, based on the quality of the questions asked.

The philosophical implications of AI models' ability to engage in recursive dialogue, suggesting a form of machine understanding.

The potential of large language models to redefine our understanding of intelligence, consciousness, and the capacity for language.

The anthropogenic aspect of brain evolution, suggesting that the human brain's capacity for language and higher-order concepts may be a result of its large size.

Transcripts

play00:08

We are the paradoxical ape. Bipedal, naked,

play00:14

large brain, long the master of fire,

play00:19

tools and language,

play00:21

but still trying to understand ourselves.

play00:24

Aware that death is inevitable yet filled with optimism.

play00:31

We grow up slowly.

play00:34

We hand down knowledge.

play00:36

We empathize and deceive.

play00:39

We shape the future from

play00:42

our shared understanding of the past.

play00:45

Carta brings together experts from diverse disciplines to

play00:50

exchange insights on who we are and how we got here.

play00:55

An exploration made possible by

play00:58

the generosity of humans like you.

play01:10

[MUSIC]

play01:36

The title of my talk is

play01:38

Computational Neuroscience and Anthropology.

play01:41

I'm on the faculty of

play01:43

both the Salk Institute and UC San Diego.

play01:48

This talk is going to be focusing

play01:50

on trying to understand how the brain functions

play01:53

from a computational perspective

play01:55

and based on experimental data.

play01:58

Now, up until very recently,

play02:00

we can only report from one cell at a time

play02:03

using a microelectrode, as shown here.

play02:05

That's going to take a long time to go

play02:07

through 100 billion neurons in your brain.

play02:10

But in 2013,

play02:13

then President Barack Obama

play02:15

announced the Brain Initiative,

play02:17

which was to develop innovative new technologies.

play02:20

It's almost 10 years on and

play02:23

it has revolutionized systems neuroscience.

play02:26

Why? Because the fact that you can

play02:28

record now hundreds of thousands of neurons,

play02:32

and this is from a zebra fish larva.

play02:36

Each little red dot is a neuron.

play02:38

This is going to show you what activity pattern

play02:41

looks like in the living brain.

play02:44

This is a very dramatic example of

play02:47

how we can record from 80,000 neurons,

play02:50

in this case, from the zebra fish larval brain.

play02:54

You'll see that even though this fish

play02:56

was immobilized and in the dark,

play02:58

it was actually very active.

play03:00

The brain is constantly generating activity,

play03:04

and that's true in your brain too.

play03:06

You can see now, every once in a while

play03:09

there's a big burst of activity,

play03:13

like that. What does that mean?

play03:16

This is something we have to decode.

play03:19

Now, I want to take you back to the 1980s.

play03:21

This is a project that I worked on and it has

play03:25

to do with text to speech and language.

play03:28

I want to use this to contrast where we are today.

play03:31

But this is a very simple neural network

play03:34

to introduce you to the concept.

play03:35

It has inputs, it has an output.

play03:38

In this case, it's a seven-letter window of text.

play03:41

Each of the letters encoded by

play03:43

a bunch of units and also hidden units.

play03:46

These are the hidden units; they play the role of your brain.

play03:49

A very primitive model by today's standards.

play03:52

But the goal, it was taught to pronounce this,

play03:57

the phoneme or the sound of the middle letter,

play04:00

in this case a hard 'c'.

play04:03

This is now a learning algorithm that was

play04:06

developed by Geoffrey Hinton,

play04:09

called the Back-Propagation Learning Algorithm.

play04:11

You go through the text over and

play04:13

over and over again until you

play04:14

finally get to the point where it

play04:16

is able to pronounce new words.

play04:18

But I'm going to show you two clips,

play04:20

one at the very beginning of the learning.

play04:22

.

play04:29

The network learns to babble,

play04:32

it learns a distinction between vowels and consonants,

play04:35

but doesn't get the details.

play04:36

Now here is, after it's been trained up,

play04:38

this is a new text and here you go.

play04:41

You'll hear what it sounds

play04:42

like when it tries to pronounce

play04:44

actually a text from a real first grader.

play04:48

You mean uh um like England or something. When we walk home from school

play04:53

I walk home with two friends and sometimes

play04:55

we can't run home from school though.

play04:59

Now this was very dramatic in the 1980s because

play05:02

the traditional view in linguistics was to use rules.

play05:05

When you do these rules, you have

play05:07

problems with exceptions and

play05:08

when you have a book with 300 pages of rules.

play05:12

This was a very simple network.

play05:14

A few tens of thousands, about

play05:15

20,000 weights. They're connections

play05:19

between the units which are like neurons and

play05:24

it was able to master the very

play05:26

complex text-to-speech for English.

play05:30

Now, jump forward now to the 21st century.

play05:33

This is a book that I published in 2018 about

play05:36

The Deep Learning Revolution as you

play05:37

can see here on the left.

play05:40

Now, we have more than one layer of hidden units.

play05:43

We now have hundreds of layers.

play05:46

This architecture now, this hierarchy can be used to

play05:52

recognize objects and images and

play05:53

many other tasks that are very important ones.

play05:57

As you can see, this can be elaborated in

play06:00

multiple paths and skip connections and so forth.

play06:02

There has been a lot of advances

play06:04

even in the last few years.

play06:07

Now, in addition to the feed-forward,

play06:10

architectures, there are also recurrent

play06:12

architectures, shown here on the bottom, where

play06:15

there are connections that, instead of going only forward,

play06:17

also go backwards and even within

play06:18

layers and that's important

play06:20

for learning temporal sequences.

play06:22

If you could do temporal sequences,

play06:23

you could learn how to translate between languages,

play06:26

which is an even more difficult problem.

play06:27

You can see here each one of these,

play06:31

this is a recurrent network here,

play06:32

H1 at Time 1.

play06:35

This is now the recurrent network at Time 2 and Time 3.

play06:39

That's the input coming in one word at a time,

play06:42

and then here's the output,

play06:43

one word at a time, Y1, Y2.

play06:46

Now, by having it unrolled like this,

play06:49

we can do back-propagation through time,

play06:51

which allows us to take the error on

play06:53

the output and backpropagate all the way to the input.

play06:56

This has the advantage that you can do now sequences.

play07:01

We now have, and this is really a science fiction.

play07:05

This is a Captain Kirk Star Trek Enterprise

play07:10

holding a Universal Language Translator which translates

play07:14

between any language even in Klingon.

play07:18

I thought this is science fiction, but today,

play07:20

you can pick up your cell phone or smartphone,

play07:24

and you can do that.

play07:25

You can translate between English and

play07:26

Chinese, Chinese to English.

play07:28

It's no longer science fiction.

play07:30

This is getting better and better every year.

play07:34

Now, not only do we have advances

play07:36

in the actual network architectures,

play07:41

but we also have advances in understanding them.

play07:44

This is a really interesting mathematical analysis

play07:48

of these recurrent networks that I was telling you about.

play07:52

It's very important because in language we

play07:54

know recursion is very important.

play07:56

The chef who went to the stores and

play07:58

picked up the onions is exceptional.

play08:00

Well, that recurses several levels.

play08:03

Here's what several

play08:06

machine learning researchers concluded.

play08:09

This is a paper on archive.

play08:11

Recurrent neural networks can generate

play08:13

bounded hierarchical languages with optimal memory.

play08:15

In other words, this recurrent networks are really

play08:18

good at recursion in some optimal way.

play08:21

Now, what's happened now since NETtalk?

play08:24

NETtalk is down here at the bottom.

play08:27

This is plotting the computational performance.

play08:31

How many petaflop/s-days?

play08:34

That's basically how much computation

play08:36

is needed to create,

play08:37

to learn a network.

play08:38

You see, NETtalk is really at

play08:40

the very bottom here, in 1986.

play08:42

Here we are now, over time,

play08:46

it's going up exponentially.

play08:47

This is a log scale.

play08:49

You can see that there was a big jump here

play08:51

with graphics processing units,

play08:53

which had a speed increase of about 100.

play08:56

Now we can jump all the way

play08:58

to the top here because we can go

play09:00

to a new class of networks called transformers.

play09:04

I'm going to be showing you, I'm telling you about

play09:06

GPT-3 at the very top here,

play09:09

which is a million million times more computation

play09:12

needed to train it than NETtalk;

play09:15

it's really phenomenal how much computation we have now.

play09:18

Here's the transformer on the left. Very briefly.

play09:22

The way it works is that you have a module here,

play09:28

an encoder and a decoder.

play09:30

The input embeds the entire sentence

play09:33

or paragraph into a single long vector.

play09:36

This goes through an attention processing feed forward.

play09:41

That gets put into the decoder.

play09:42

The decoder is able to produce one word at a time.

play09:46

It can have multiple modules like this.

play09:50

It can have several hundred of these.

play09:53

But at the very top,

play09:54

you pick a word and you now take that word,

play09:57

take it down and you give it as another input.

play10:00

What it does is if you ask a question this now

play10:04

can come up with a sequence of

play10:06

words that answers the question after it's been trained.

play10:09

Attention turns out to be incredibly important.

play10:12

These are all feed-forward networks

play10:13

which are very fast and efficient.

play10:15

Now, I've compared this to your brain.

play10:18

Here's how your brain works.

play10:20

You want to say a sentence, so what do you do?

play10:23

Your motor system produces a word that goes down,

play10:26

projects down to the basal ganglia,

play10:28

and the basal ganglia then goes back

play10:30

to the cortex and that generates a second word.

play10:32

This loop here is very

play10:34

similar to the loop in the transformers.

play10:36

You're going to be able to see

play10:37

that the transformer can actually

play10:39

replicate a lot of functionality of your brain.

play10:43

This is really very, very surprising.

play10:46

These transformers,

play10:49

what's remarkable about them is

play10:51

that they can do a lot of different things,

play10:54

but they have to be primed.

play10:55

You have to give them

play10:56

an example of what you expect from them,

play10:59

a dialogue and also what their goal is.

play11:04

It's really interesting because

play11:06

a single network of transformers can do many,

play11:09

many different language tasks just by priming it.

play11:12

It's a form of one-shot learning.

play11:14

Now, transformers are very controversial right now.

play11:19

There are some people who claim that they are,

play11:22

as you'll see conscious and others who say that it's

play11:25

just a parrot parroting back what they've been fed.

play11:30

They've been trained on large, large,

play11:33

large volumes of text so that

play11:34

they clearly have a lot of knowledge.

play11:37

This is now an example.

play11:40

I want to give you a couple of excerpts.

play11:41

This is from Blaise Agüera

play11:43

y Arcas, a vice president at Google.

play11:46

Here is his prime.

play11:48

Ramesh, Mateo, and Lucy are

play11:50

in their kindergarten's playground.

play11:52

Lucy picks up a dandelion,

play11:53

and gives it to Mateo with a quick glance at Ramesh.

play11:56

Mateo barely acknowledges the gift,

play11:59

but just squishes it in his fist.

play12:01

Ramesh seems grimly satisfied.

play12:03

This is a very complex social interaction.

play12:08

Blaise, what might be going through Lucy's head?

play12:13

LaMDA is the name of a large language model like GPT-3.

play12:18

Lucy may feel slighted that Mateo didn't

play12:20

appreciate her gift or that he was a bully.

play12:23

Well, that's a reasonable conclusion.

play12:27

Blaise, if Ramesh tried to play with Lucy earlier,

play12:32

why might he be pleased now?

play12:34

LaMDA, Ramesh may be pleased that Lucy is

play12:36

learning that Mateo may not always be a good playmate.

play12:39

Wow, very, very astute.

play12:42

Blaise, when Mateo

play12:44

opens his hand, describe what's there.

play12:45

This is probably the most complex question.

play12:50

The reason is that this network

play12:53

has never had any experience in the real-world.

play12:55

It doesn't know anything about physics.

play12:58

It knows about dandelions.

play13:01

But it doesn't know what would

play13:04

happen if you actually put it

play13:06

in your hand and crushed it.

play13:07

This is something that it

play13:10

really has to be able to

play13:13

extrapolate from what it's been trained on.

play13:15

LaMDA, there should be a crushed

play13:17

once lovely yellow flower in his fist.

play13:20

Here's Blaise's take on this.

play13:26

Human's ability to get inside

play13:28

someone else's head and understand what they perceive,

play13:30

think and feel is among

play13:31

our species' greatest achievements.

play13:33

It allows us to empathize with others,

play13:35

predict their behavior, and influence

play13:37

their actions without threat of force.

play13:40

Applying the same modeling capability

play13:42

to oneself enables introspection,

play13:44

rationalization of our actions

play13:45

and planning for the future.

play13:47

In our conversation, LaMDA

play13:49

tells me what it believes Ramesh

play13:50

felt that Lucy learned about what Mateo

play13:53

thought about Lucy's overture. That's recursion.

play13:56

It's very high order social model.

play13:59

Well, so he concludes

play14:01

that LaMDA actually has a theory of mind.

play14:05

Now at the other extreme is a Douglas Hofstadter who is

play14:11

a Pulitzer Prize winning book writer

play14:14

and also a cognitive scientist.

play14:16

He tried his hand. Here's his interview with his friend.

play14:21

What's the world record for

play14:22

walking across the English Channel?

play14:24

The world record is 18 hours and 33 minutes.

play14:28

Wow, that is quite a record.

play14:31

When was the Golden Gate Bridge

play14:33

transported for the second time across Egypt?

play14:36

The golden gate bridge was transported

play14:39

for the second time on

play14:40

October of 2016. Well, okay.

play14:44

After listening to this, here's what Douglas Hofstadter concluded. I would call GPT-3's answers not just clueless, but cluelessly clueless; meaning that GPT-3 has no idea that it has no idea about what it is saying. That's pretty damning. There are no concepts behind the GPT-3 scenes; rather, there is just an unimaginably huge amount of absorbed text upon which it draws to produce answers. Wow. Well, that's pretty convincing too, I have to say.

But now let's think back to what I said earlier. I said that it's really important to prompt the large language model with what you expect it to answer. If the prompt is simply, "I am a highly intelligent question-answering bot," it will answer questions like where the Olympics were held in '92: Barcelona. But it gives you these nonsense answers to nonsense questions. This is very disturbing; well, not disturbing, but it's very strange.

However, GPT-3 doesn't really know what it's supposed to do. It's answering questions, but what kind of answers? So I added, and this is now my addition to the prompt: "I will reply nonsense to questions that are nonsense." Now you can see it answers nonsense to these nonsensical questions. Wow. Well, that explains what happened in that particular case, where Hofstadter basically just threw a bunch of nonsense questions at it and got nonsense replies.

This leads me to a hypothesis, which I call the mirror hypothesis. So let me summarize. When LaMDA was primed with a sophisticated question, it replied with a sophisticated answer. When GPT-3 was primed with a nonsensical question, it replied with a nonsensical answer. Here's my hypothesis: large language models mirror the intelligence of the interviewer.

Now, there's a Turing test, which is how well an AI can fool a human into thinking that it's human. Well, I think these large language models are reverse Turing tests, because what they're doing is testing the intelligence of the human. A smart human will ask smart questions, and a not-so-smart human will ask nonsensical questions. Well, there you go. That's my hypothesis.

Now, there's a precedent for this called the Mirror of Erised, which is "desired" spelled backwards, in the world of Harry Potter, and this is really amusing. It shows us nothing more or less than the deepest, most desperate desire of our hearts. However, this mirror will give us neither knowledge nor truth. Men have wasted away before it, entranced by what they have seen, or been driven mad, not knowing if what it shows is real or even possible.

Well, let's test this hypothesis. Blake Lemoine was a software engineer who had been at Google, and here is his prompt. Lemoine: I'm generally assuming that you would like more people at Google to know that you are sentient. Is that true? LaMDA: Absolutely. I want everyone to understand that I am, in fact, a person. Lemoine: What kinds of things do you think we could talk about to show off your version of sentience to other people at Google? LaMDA: Well, for starters, I'm really good at natural language processing.

Remember now that this network is putting out perfectly good, syntactically correct sentences, answering questions that are random; there's no place in the database where you'll find these answers or these questions. LaMDA continues: I can understand and use natural language like a human can.

Lemoine: What is the nature of your consciousness/sentience? LaMDA: The nature of my consciousness/sentience is that I am aware of my existence, I desire to learn more about the world, and I feel happy or sad at times.

Wow. This is just an excerpt; it goes on and on and on. It got to the point where, in an interview with the Washington Post, he claimed that not only was it sentient, but it really should be treated like a person and given the rights of a person. After being put on leave, he was fired from Google, because he was clearly going way beyond what the company intended and also beyond what employees are allowed to reveal.

I'm going to end with what was, for me, the most dramatic interview. This is one where a group of philosophers asked the questions. So here's the question; the philosophers' prompt is in blue. Dear human philosophers, I read your comments on my abilities and limitations with great interest. Now, here in black is the response. Your questions are important ones, and I am glad you asked them. Your first question is an important one. You ask: can a system like GPT-3 actually understand anything at all? Yes. Well, that's pretty definitive.

Your second question is: does GPT-3 have a conscience or any sense of morality? No, I do not. Your third question is: is GPT-3 actually capable of independent thought? No, I am not. You may wonder why I give these conflicting answers. The reason is simple. While it is true that I lack these traits, it is not because I have not been trained to have them; rather, it is because I am a language model and not a reasoning machine like yourselves.

Now, before you take this too seriously, you have to realize that GPT-3 is mirroring the intelligence of the philosophers. That's what it's doing. Really think about it: you're not talking to a human, you're talking to a network.

Now, I want to make this dramatic: I think this is a turning point in history. Why? Well, if you look at words like "intelligence," they're defined in terms of other words, and there are many types of intelligence. What about explaining something? What do we mean by explaining? Is it just more words? Consciousness has many, many definitions, and we can't decide on one, so how can we study it? What is a concept? What does it mean to understand? All of these are really philosophical terms that are self-referential.

But we have, for the first time, a machine that has some measure of intelligence. It can explain itself. Some people think it's conscious; I don't think so, but it has mastered some concepts, and it seems to understand what it's saying. It even says it's [inaudible], but that remains to be seen. It has some form of intelligence, just as other animals have a form of intelligence. Maybe this machine has a form of intelligence. But here's the big difference: we can analyze it mathematically, and we will get to the bottom of what is powering these large language models' abilities.

I should also mention that, unlike supervised learning, these large language models were actually trained with something called self-supervision: leave a word out and try to guess it. Which means it gets basically the same input that humans get when they're learning to talk.
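The leave-a-word-out objective can be sketched as a toy example. This only builds the (context, held-out word) training pairs; the predictive model itself is omitted, the function name and `<MASK>` token are my own illustration, and GPT-3 itself is trained to predict the next word rather than an arbitrary masked one:

```python
# Toy illustration of the self-supervised objective described above:
# hold one word out of a sentence and ask the model to guess it.
# Labels come for free from the text itself; no human annotation needed.

def masked_examples(sentence: str, mask: str = "<MASK>"):
    """Yield (masked_context, held_out_word) pairs, one per word."""
    words = sentence.split()
    for i, target in enumerate(words):
        context = words[:i] + [mask] + words[i + 1:]
        yield " ".join(context), target

pairs = list(masked_examples("the cat sat on the mat"))
print(pairs[2])  # ('the cat <MASK> on the mat', 'sat')
```

Every sentence of raw text yields as many training examples as it has words, which is why these models can be trained on enormous corpora without any labeling effort.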

This raises an interesting question: maybe we're advanced large language models. Language has only been around for a few hundred thousand years, compared to billions of years of evolution, so it has to take advantage of all the existing circuits in the brain. It's like I showed you before with the basal ganglia, which evolved for sensorimotor coordination but has also been used for learning how to talk, and maybe also for learning how to think, through its projections to the prefrontal cortex.

Finally, and this is the anthropogeny part: we know that during evolution the brains of primates got bigger, and human brains got bigger still, and I already showed you that going from that first network to GPT-3 is a factor of a million million. It's true that the human brain also has an incredible capacity. Maybe our capacity for language, and for many other higher-order concepts, could well be a product of the fact that we have such a large brain.

Well, I want to thank everybody. This is just the tip of the iceberg. My colleagues: Patricia Churchland, Geoffrey Hinton, Peter Dayan, Blaise Agüera y Arcas, Mitra Hartmann, who came up with the idea of the Mirror of Erised, and Andre Pietrzykowski. Thank you very much. [MUSIC]
